BigDFT.PostProcessing module

A module for post processing BigDFT calculations.

class BigDFTool(omp=1, mpi_run=None)[source]

Bases: object

This module defines the actions that can be performed using the bigdft-tool python script. Such a script sometimes invokes the memguess executable, which is executed serially. Sometimes the utilities main program is necessary to take care of more time consuming operations which require parallelism. We might also use the obabel wrapper provided by BigDFT.

Parameters
  • omp (int) – number of OpenMP threads. It defaults to the $OMP_NUM_THREADS variable in the environment, if present, otherwise it fixes the run to 1 thread.

  • mpi_run (str) – define the MPI command to be used. It defaults to the value $BIGDFT_MPIRUN of the environment, if present. When using this calculator into a job submission script, the value of $BIGDFT_MPIRUN variable may be set appropriately to launch the bigdft executable.

auto_fragment(system, view, cutoff, verbose=False, rand=False, criteria='bondorder')[source]

Refragment a system based on the results of a previous calculation.

The auto fragment protocol performs a greedy optimization using either the bond order or the distance between fragments as a guide. The purity values and the fragment bond orders are essential quantities for this calculation, so they can be passed if they are already cached.

By using the rand keyword, you can trigger a stochastic refragmentation process. If this process is repeated many times, you may find a result that improves upon the greedy optimization approach.

Parameters
  • system (BigDFT.Systems.System) – the system to fragment.

  • view (BigDFT.Systems.System) – a view of the system.

  • cutoff (float) – the termination criteria. When the worst fragment is more pure than this cutoff, the calculation stops.

  • verbose (bool) – whether to print out as it proceeds.

  • rand (bool) – whether to activate the stochastic approach.

  • criteria (string) – either distance or bondorder.

Returns

a mapping from the old system to the new system where each fragment fullfills the purity criteria.

Return type

dict

ccs_node(log, matrixname)[source]

Defines the protocol to validate the ccs matrix file

compute_fragment_dos(frag, log, ks_coeff, eigvals, frag_indices=None, smat=None, assume_pure=False, **kwargs)[source]

Compute the partial density of states projected onto a given fragment.

Parameters
  • sys (BigDFT.Fragments.Fragment) – the fragment to project on to.

  • log (BigDFT.Logfiles.Logfile) – the log of the calculation.

  • ks_coeff (scipy.sparse.csc_matrix) – the matrix of eigenvectors.

  • eigvals (list) – a list of eigenvalues.

  • frag_indices (list) – list of indices associated with this fragment.

  • smat (scipy.sparse.csc_matrix) – the overlap matrix.

  • assume_pure (bool) – an optimization can be performed if we assume the target is pure.

  • kwargs (dict) – any extra options to pass to the DoS constructor.

Returns

a density of states object built using the partial density of states.

Return type

(BigDFT.DoS.DoS)

create_layered_qmmm_system(system, target, pairwise_bo, cutoffs, criteria='bondorder', link_atoms=False)[source]

Creates a multilayered system suitable for QM/MM calculations. For each layer, a suitable QM region is built around it.

Parameters
  • system (BigDFT.Systems.System) – a System class, already broken up into fragments.

  • pairwise_bo (dict) – pairwise bond order values between all fragments in the system.

  • target (str) – the name of the fragment to treat as the target of the qm/mm run.

  • cutoffs (list) – a list of cutoff value for fragment interactions. The number of layers is equal to the number of cutoffs.

  • criteria (str) – how to determine which atoms are included in the QM region. Valid choices are “bondorder” and “distance”.

  • link_atoms (bool) – whether to generate link atoms.

Returns

a list of Systems, one for each QM layer. (System): the MM region.

Return type

(list)

create_qmmm_system(system, target, bond_order, cutoff, criteria='bondorder', link_atoms=False)[source]

Creates a system suitable for QM/MM calculations.

Parameters
  • system (System) – a System class, already broken up into fragments.

  • target (str) – the name of the fragment to treat as the target of the qm/mm run.

  • bond_order (dict) – bond order values between the target fragment and all other fragments in the system.

  • cutoff (float) – a cutoff value for fragment interactions.

  • criteria (str) – how to determine which atoms are included in the QM region. Valid choices are “bondorder” and “distance”.

  • link_atoms (bool) – whether to generate link atoms.

Returns

the QM region. (System): the MM region.

Return type

(System)

derived_quantity(name, log, files, generator)[source]

Defines a quantity that can be derived from the ccs files in the files list. Requires the generator function.

fragment_bond_order(sys, fraglist1, fraglist2, log, kxs=None, frag_indices=None)[source]

Computes “bond order” between two sets of fragments using the method of Mayer. For two atomic fragments, this would describe the bond multiplicity of the covalent bond.

Parameters
  • sys (BigDFT.Systems.System) – the system containing the fragments of interest

  • fraglist1 (list) – a list of fragments to compute the bond order of.

  • fraglist2 (list) – a list of fragments to compute the bond order between.

  • log (BigDFT.Logfiles.Logfile) – the log describing a calculation.

  • kxs (scipy.sparse.csc_matrix) – the matrix K*S, which might be already computed to reduce I/O time.

  • frag_indices (dict) – the matrix indices associated with each fragment.

Returns

a dictionary of dictionaries, mapping the bond order of each fragment in fraglist1 to each fragment in fraglist2.

Return type

(dict)

fragment_interaction_energy(sys, fraglist1, fraglist2, log, frag_indices=None, sinvh=None, kxs=None)[source]

Compute the interaction energies between two sets of fragments.

Parameters
  • fraglist1 (list) – a list of fragments to compute the interaction energy of.

  • fraglist2 (list) – a list of fragments to compute the interaction energy between.

  • log (BigDFT.Logfiles.Logfile) – the log describing a calculation.

  • frag_indices (dict) – the matrix indices associated with each fragment.

  • sinvh (scipy.sparse.csc_matrix) – the matrix S^{-1}*H, which might be already computed to reduce I/O time.

  • kxs (scipy.sparse.csc_matrix) – the matrix K*S, which might be already computed to reduce I/O time.

Returns

the projected energy.

Return type

(float)

fragment_mask_matrix(sys, mat, fragments, log)[source]

Sometimes we don’t want to store an entire matrix, just the parts related to some fragments of interest. This routine will mask out a matrix, keeping entries only related to the list of fragments provided.

Parameters
  • sys (BigDFT.Systems.System) – the system associated with the matrix.

  • mat (scipy.sparse.csr_matrix) – the matrix to mask.

  • fragments (list) – a list of fragment ids to keep.

  • log (BigDFT.Logfiles.Logfile) – the logfile associated with this matrix’s calculation.

Returns

the masked matrix.

Return type

(scipy.sparse.csr_matrix)

fragment_population(sys, log, frag_indices=None, kxs=None)[source]

Performs Mulliken population analysis on a fragment, in case charges haven’t been computed by doing a multipole analysis.

Parameters
  • sys (BigDFT.Systems.System) – the system to compute the population of.

  • log (BigDFT.Logfiles.Logfile) – the log describing a calculation.

  • frag_indices (dict) – the matrix indices associated with each fragment.

  • kxs (scipy.sparse.csc_matrix) – the matrix K*S, which might be already computed to reduce I/O time.

Returns

a mapping from fragment ids to charges.

Return type

(dict)

This routine adds link atoms to a subsystem based on the bond order of a full system. Link atom positions are automatically adjusted based on the length of some standard bonds.

Parameters
  • fullsys (BigDFT.Systems.System) – the full system that the subsystem is embedded into.

  • subsys (BigDFT.Systems.System) – the embedded system which needs link atoms.

  • distcut (float) – this cutoff is the largest distance value we expect allow a bond to be.

Returns

the subsystem with link atoms added. (BigDFT.Systems.System): a system which has the atoms that were

removed and replaced with link atoms.

Return type

(BigDFT.Systems.System)

get_frag_indices(sys, log)[source]

Compute a lookup table of matrix indices for each fragment in a system.

Parameters
  • system (System) – instance of a System class, which defines the fragmentation.

  • log (Logfile) – logfile from the run computed on this system.

Returns

a mapping of fragment ids to lists of indices

Return type

(dict)

get_ks_coeff(log)[source]

Retrieve the Kohn-Sham coefficients and the eigenvalues in matrix form.

log (Logfile): instance of a Logfile class

Returns

the matrix of coefficients (list): list of eigenvalues.

Return type

(scipy.sparse.csc_matrix)

get_matrix_h(log)[source]

Read the hamiltonian matrix into memory.

log (Logfile): instance of a Logfile class

Returns

the matrix H

Return type

(scipy.sparse.csc_matrix)

get_matrix_k(log)[source]

Read the density matrix into memory.

log (Logfile): instance of a Logfile class

Returns

the matrix K

Return type

(scipy.sparse.csc_matrix)

get_matrix_kxs(log)[source]

Computes the matrix K*S, the mulliken version of the density matrix, and loads it into memory.

log (Logfile): instance of a Logfile class

Returns

the matrix K*S

Return type

(scipy.sparse.csc_matrix)

get_matrix_s(log)[source]

Read the overlap matrix into memory.

log (Logfile): instance of a Logfile class

Returns

the matrix S

Return type

(scipy.sparse.csc_matrix)

get_matrix_sinvh(log)[source]

Computes the matrix S^{-1}*H, the mulliken version of the spillage matrix, and loads it into memory.

log (Logfile): instance of a Logfile class

Returns

the matrix S^{-1}*H

Return type

(scipy.sparse.csc_matrix)

run_compute_purity(system, log, kxs=None, frag_list=None, frag_indices=None)[source]

Compute the purity values of the different system fragments.

Note that this can also be computed using the fragment multipoles, but this provides an implementation for when you don’t need those values.

Parameters
  • system (System) – instance of a System class, which defines the fragmentation.

  • log (Logfile) – logfile from the run computed on this system.

  • kxs (scipy.sparse.csc_matrix) – the matrix K*S, which might be already computed to reduce I/O time.

  • frag_list (list) – we can also only compute the purity values of some select fragments.

  • frag_indices (list) – the indices of the matrix associated with each fragment. This can be precomputed and passed.

Returns

for each fragment id, what is the purity value.

Return type

(dict)

file_dependency_queue(btool, ccsfile, txtfile, mpifile=None)[source]
superunits_purities(bo, pv, q_units, superunits, q_superunits)[source]

Compute the purity values of the superunits described as a unification of the units

Parameters
  • bo (matrix-like) – Fragment Bond Orders of the Units

  • pv (array-like) – Purity values of the Units

  • superunits (dict) – lookup dictionary containing the list of units per superunit

  • q_units (array-like) – charges of the units

  • q_superunits (array-like) – charges of the superunits

Returns

purities of the superunits

Return type

dict

superunits_quadratic_quantities(bo, superunits)[source]

Compute quantities that transforms like the bond orders of the superunits described as a unification of the units

Parameters
  • bo (matrix-like) – Quantities like Fragment Bond Orders of the Units

  • superunits (dict) – lookup dictionary containing the list of units per superunit

Returns

quantities of the superunits

Return type

dict

system_violinplot(system_dict, ax=None, **kwargs)[source]

Represent a violinplot from a dictionary of values.

Parameters
  • system_dict (dict) – dictionary mapping the labels to plot with their values.

  • ax (matplotlib.pyplot.axis) – the axis to plot on.

  • **kwargs – any other argument to be passed to the violinplot

Returns

the axis of the plot and the object returned from

the violinplot.

Return type

tuple

systems_heatmap(data, restrict_to=None, axs=None, columns=None, **kwargs)[source]

Create a heatmap for a set of systems.

Parameters
  • data (dict) – a dictionary mapping system names to another dictionary which maps fragment ids to property values.

  • restrict_to (list) – a list of lists saying what fragments we should draw. It is a list of lists instead of just a list because this way we can put a separator between values (for example, when making a jump in the sequence).

  • axs (matplotlib.axis) – the axis to draw on.

  • columns (list) – the order of the columns on the x axis

  • kwargs (dict) – any extra argument you wish to pass to matplotlib’s imshow command.

Returns

a reference to the imshow value.

units_quadratic_traces(unitlist1, unitlist2, frag_indices, mat1, mat2, a1=1.0, a2=1.0)[source]

BigDFT.Spillage module

A module which contains the routines needed for computing the spillage values.

AU_to_A = 0.52917721092

Conversion between Atomic Units and Bohr

class MatrixMetadata(filename)[source]

Bases: object

This class contains the information stored in the sparse matrix metadata file.

Parameters

filename (str) – the name of the metadata file to initialize from.

atoms

a list of atoms with their positions and associated basis functions.

Type

list

matdim

the dimension of the matrix.

Type

int

get_frag_indices(system)[source]

Retrive the indices of the matrix associated with each fragment of a system.

Parameters

system (Systems.System) – the set of fragments to get the indices of.

Returns

a mapping from fragment to indices.

Return type

(dict)

invert_ccs(mat)[source]

Invert a CSC matrix and restore a sparsity pattern such as it can be written without the need for large files.

Parameters

mat (csc_matrix) – the sparese matrix to invert

Returns

the inverse

Return type

csc_matrix

read_ccs(fname)[source]

Read a CCS matrix from file into a scipy csc matrix.

This routine uses the CCS matrix files that CheSS generates. :param fname: name of the file. :type fname: str

Result:

scipy.sparse.csc_matrix: the matrix in the file.

serial_compute_puritybase(sfile, dfile)[source]

This routine computes the matrix K*S using python.

You will need this for computing the purity values.

Parameters
  • sfile (str) – the file name of the overlap matrix. Must be in ccs of mtx format.

  • dfile (str) – the file name of the density kernel. Must be in ccs of mtx format.

Returns

K*S

Return type

(scipy.sparse.csc, scipy.sparse.csc)

serial_compute_spillagebase(sfile, hfile)[source]

This routine computes the matrix S^{-1}*H using python.

You will need this for computing the spillage values.

Parameters
  • sfile (str) – the file name of the overlap matrix. Must be in ccs of mtx format.

  • hfile (str) – the file name of the hamiltonian. Must be in ccs of mtx format.

Returns

Sinverse * H

Return type

(scipy.sparse.csc, scipy.sparse.csc)

BigDFT.PostProcessing example

Below we show an example of using the PostProcessing module.

def _example():
    """
    Postprocessing Example
    """
    from BigDFT.Systems import System, FragmentView
    from BigDFT.Fragments import Fragment
    from BigDFT.IO import XYZReader
    from BigDFT.Calculators import SystemCalculator
    from BigDFT.Inputfiles import Inputfile
    from scipy.linalg import eigh
    from copy import deepcopy

    # Create a system
    sys = System()
    sys["FRA:0"] = Fragment()
    with XYZReader("CH2") as ifile:
        for at in ifile:
            sys["FRA:0"].append(at)
    sys["FRA:1"] = Fragment()
    with XYZReader("CH3") as ifile:
        for at in ifile:
            sys["FRA:1"].append(at)
    sys["FRA:1"].translate([0, 0, -3])
    sys["FRA:2"] = deepcopy(sys["FRA:0"])
    sys["FRA:2"].translate([0, 0, 3])
    sys["FRA:2"].rotate(y=150, units="degrees")
    sys["FRA:3"] = deepcopy(sys["FRA:0"])
    sys["FRA:3"].translate([3, 0, 1.5])

    print(list(sys))

    # Run a calculation. `write_support_function_matrices` and `linear` are
    # key. You also should be sure to set the atom multipoles.
    inp = Inputfile()
    inp.set_xc("PBE")
    inp.set_hgrid(0.4)
    inp.write_support_function_matrices()
    inp["import"] = "linear"

    code = SystemCalculator()
    code.update_global_options(verbose=False)
    log = code.run(input=inp, posinp=sys.get_posinp(), run_dir="work")
    sys.set_atom_multipoles(log)

    # Create the post processing tool.
    from BigDFT.PostProcessing import BigDFTool
    btool = BigDFTool()

    # Purity
    purity = btool.run_compute_purity(sys, log)
    print(purity)

    # Charges
    charges = {fragid: sum(at.nel for at in frag)
               for fragid, frag in sys.items()}

    # Bond Orders
    bo = btool.fragment_bond_order(sys, sys.keys(), sys.keys(), log)

    # Population values.
    population = btool.fragment_population(sys, log)
    print(population)

    # These three things define a fragment view.
    view = FragmentView(purity, bo, charges)

    # Auto Fragmentation
    mapping = btool.auto_fragment(sys, view, 0.10)
    print(mapping)

    # This defines a new view.
    new_view = view.refragment(mapping)
    print(new_view.purities)

    # Eigenvalues.
    H = btool.get_matrix_h(log)
    S = btool.get_matrix_s(log)
    w = eigh(H.todense(), b=S.todense(), eigvals_only=True)
    print(w)